feat: Introduce tiered timeout system with per-endpoint configuration #653

Open
vdusek wants to merge 7 commits into master from worktree-fix-timeouts

Conversation

@vdusek
Contributor

@vdusek vdusek commented Mar 2, 2026

Summary

  • Introduce three configurable timeout tiers plus a cap:
    • short (5s by default),
    • medium (30s by default),
    • long (360s by default),
    • max (360s by default), which caps the timeout growth across retries.
  • Introduce Timeout type alias: timedelta | Literal['no_timeout', 'short', 'medium', 'long']
  • Keep the exponential timeout growth per retry attempt (doubles each retry, capped at timeout_max).
  • The default timeout tiers were kept as they were (let's adjust them later in a dedicated PR).
  • Reduce default max retries from 8 to 4 (5 attempts by default).
  • Rename _internal_models.py to _types.py and consolidate type aliases (Timeout, JsonSerializable) into it.
  • Remove old timeout constants (FAST_OPERATION_TIMEOUT, STANDARD_OPERATION_TIMEOUT).
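
A minimal sketch of how the tier resolution and per-retry growth described above could fit together. The names `TIER_DEFAULTS`, `TIMEOUT_MAX`, and `resolve_timeout` are illustrative assumptions rather than the client's actual internals, and the 1-based `attempt` numbering is assumed as well.

```python
from __future__ import annotations

from datetime import timedelta
from typing import Literal, Union

# Mirrors the alias described in the summary.
Timeout = Union[timedelta, Literal['no_timeout', 'short', 'medium', 'long']]

# Hypothetical tier table using the defaults listed in the summary.
TIER_DEFAULTS = {
    'short': timedelta(seconds=5),
    'medium': timedelta(seconds=30),
    'long': timedelta(seconds=360),
}
TIMEOUT_MAX = timedelta(seconds=360)


def resolve_timeout(timeout: Timeout, attempt: int) -> timedelta | None:
    """Resolve a tier or explicit timedelta for a given attempt (assumed 1-based).

    Returns None for 'no_timeout'; otherwise doubles the base value on each
    retry and caps the result at TIMEOUT_MAX.
    """
    if timeout == 'no_timeout':
        return None
    base = TIER_DEFAULTS[timeout] if isinstance(timeout, str) else timeout
    grown = base * 2 ** (attempt - 1)
    return min(grown, TIMEOUT_MAX)
```

With these assumed defaults, `resolve_timeout('short', 3)` gives 20 seconds, while `resolve_timeout('long', 2)` stays at the 360-second cap.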

Timeout tiers

| Tier | Default | Usage |
| --- | --- | --- |
| short | 5s | 16 methods (only in storage clients) |
| medium | 30s | 7 methods (only in storage clients) |
| long | 360s | 150+ (all the others) |
| no_timeout | | It is not used by default |
| max | 360s | Cap for exponential timeout growth across retries |

Let's discuss the best defaults later in a dedicated PR.

What this solves

  • Before this change, timeout handling across the client was fragmented and inconsistent in three major ways:
    • Some resource clients had hardcoded internal timeouts.
    • Users had no way to control per-request timeouts.
    • The timeout parameter name was ambiguous on Actor/Task run methods.

Test plan

  • CI passes

@vdusek vdusek added adhoc Ad-hoc unplanned task added during the sprint. t-tooling Issues with this label are in the ownership of the tooling team. labels Mar 2, 2026
@vdusek vdusek self-assigned this Mar 2, 2026
@github-actions github-actions bot added this to the 135th sprint - Tooling team milestone Mar 2, 2026
@codecov

codecov bot commented Mar 2, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 96.74%. Comparing base (841d760) to head (9709a34).
⚠️ Report is 4 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #653      +/-   ##
==========================================
+ Coverage   96.57%   96.74%   +0.17%     
==========================================
  Files          45       45              
  Lines        4318     4336      +18     
==========================================
+ Hits         4170     4195      +25     
+ Misses        148      141       -7     
Flag Coverage Δ
integration 94.69% <99.34%> (-0.03%) ⬇️
unit 78.27% <69.05%> (+0.22%) ⬆️

@vdusek vdusek force-pushed the worktree-fix-timeouts branch from 9653d95 to 634577c Compare March 2, 2026 13:05
- Add `Timeout` type alias (`timedelta | Literal['no_timeout'] | None`)
  to `_consts.py` and export it from the public API
- Accept `None` on abstract `HttpClient.call()` / `HttpClientAsync.call()`
  so the HTTP client resolves its own default timeout internally
- Move `_calculate_timeout` into the impit HTTP client implementation
  since exponential timeout growth on retries is client-specific logic
- Add `timeout` parameter to all public resource client methods for
  user-controllable per-request HTTP timeouts
- Add timeout parameter to base methods `_list()`, `_create()`, and
  `_get_or_create()` which previously lacked it
- Rename domain-specific Actor/Task run timeout from `timeout` to
  `run_timeout` to avoid ambiguity with the HTTP request timeout
- Update docstrings to document three-state timeout semantics

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@vdusek vdusek requested a review from Pijukatel March 2, 2026 13:22
@vdusek vdusek marked this pull request as ready for review March 2, 2026 13:23
@vdusek vdusek changed the title from "refactor: Standardize timeout handling across all resource clients" to "feat: Standardize timeout handling with per-endpoint Timeout type alias" Mar 2, 2026
@vdusek vdusek changed the title from "feat: Standardize timeout handling with per-endpoint Timeout type alias" to "feat: Add per-endpoint timeout configuration for all resource clients" Mar 2, 2026
@vdusek vdusek marked this pull request as draft March 2, 2026 14:39
Introduce `timeout_max` parameter to cap exponential timeout growth
independently from the initial timeout. Lower defaults to
`DEFAULT_REQUEST_TIMEOUT=10s`, `DEFAULT_REQUEST_TIMEOUT_MAX=600s`,
and `DEFAULT_MAX_RETRIES=4`, producing the sequence [10, 20, 40, 80, 160].

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
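
To sanity-check the sequence quoted in this commit message, here is a small sketch of the doubling schedule. `timeout_schedule` is a hypothetical helper, not part of the client; it assumes one initial attempt plus `max_retries` retries.

```python
from __future__ import annotations

from datetime import timedelta


def timeout_schedule(initial: timedelta, cap: timedelta, max_retries: int) -> list[float]:
    """Return the per-attempt timeouts in seconds: doubled each attempt, capped at `cap`."""
    attempts = max_retries + 1  # assumption: the first attempt does not count as a retry
    return [min(initial * 2**i, cap).total_seconds() for i in range(attempts)]


# Values from the commit message: 10s initial, 600s cap, 4 retries.
print(timeout_schedule(timedelta(seconds=10), timedelta(seconds=600), 4))
# prints [10.0, 20.0, 40.0, 80.0, 160.0]
```

With these values the 600s cap is never reached; a lower cap such as 60s would flatten the tail of the sequence to 60.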
@vdusek vdusek force-pushed the worktree-fix-timeouts branch from 634577c to ac20367 Compare March 2, 2026 14:57
Contributor

@Pijukatel Pijukatel left a comment

I would be very careful about changing the actual timeouts. Can we please split this into 2 PRs:

  • Refactoring (majority)
  • Assigning different timeouts to API operations

@vdusek vdusek force-pushed the worktree-fix-timeouts branch from 5071549 to 3d9f230 Compare March 3, 2026 15:04
@vdusek vdusek changed the title from "feat: Add per-endpoint timeout configuration for all resource clients" to "feat: Introduce tiered timeout system with per-endpoint configuration" Mar 3, 2026
@vdusek vdusek force-pushed the worktree-fix-timeouts branch from 3d9f230 to 0ad2c46 Compare March 3, 2026 15:21
@vdusek vdusek marked this pull request as ready for review March 3, 2026 15:21
@vdusek vdusek marked this pull request as draft March 3, 2026 15:22
@vdusek vdusek requested a review from Pijukatel March 4, 2026 09:41
@vdusek vdusek marked this pull request as ready for review March 4, 2026 09:41
Copilot AI left a comment

Pull request overview

Introduces a tiered timeout system across the Apify Python client, enabling per-endpoint timeout configuration via a shared Timeout type and centralized timeout resolution in HTTP clients.

Changes:

  • Added timeout tiers (short/medium/long + no_timeout) and propagated a timeout parameter through most resource-client methods.
  • Updated HTTP client internals to resolve timeout tiers and apply exponential per-attempt timeout growth capped by timeout_max.
  • Updated tests/docs for the new timeout API and reduced default retries (DEFAULT_MAX_RETRIES 8 → 4).

Reviewed changes

Copilot reviewed 41 out of 41 changed files in this pull request and generated 6 comments.

Show a summary per file
| File | Description |
| --- | --- |
| `tests/unit/test_pluggable_http_client.py` | Updates fake custom HTTP clients and tests to use `Timeout` and pass per-call timeout tiers. |
| `tests/unit/test_http_clients.py` | Adjusts base HTTP client init test to the new tiered timeout fields. |
| `tests/unit/test_client_timeouts.py` | Adds unit coverage for tier resolution, `no_timeout`, and timeout growth/capping. |
| `tests/unit/test_actor_start_params.py` | Updates Actor start tests to use `run_timeout` instead of `timeout`. |
| `tests/integration/test_webhook.py` | Fixes integration test type expectation for Actor call result. |
| `src/apify_client/_types.py` | Introduces `Timeout` and consolidates `JsonSerializable` plus minimal polling models. |
| `src/apify_client/_resource_clients/webhook_dispatch_collection.py` | Adds `timeout` parameter to list operations and forwards to `_list`. |
| `src/apify_client/_resource_clients/webhook_dispatch.py` | Adds `timeout` parameter to get operations and forwards to `_get`. |
| `src/apify_client/_resource_clients/webhook_collection.py` | Adds `timeout` parameter to list/create and forwards to `_list`/`_create`. |
| `src/apify_client/_resource_clients/webhook.py` | Adds `timeout` parameter to CRUD/test methods and forwards to HTTP calls. |
| `src/apify_client/_resource_clients/user.py` | Adds `timeout` parameter to user endpoints and forwards to `_get`/`_http_client.call`. |
| `src/apify_client/_resource_clients/task_collection.py` | Adds request `timeout` and renames run `timeout` → `run_timeout` for create. |
| `src/apify_client/_resource_clients/task.py` | Adds request `timeout` broadly; renames run `timeout` → `run_timeout` for start/call/update. |
| `src/apify_client/_resource_clients/store_collection.py` | Adds `timeout` parameter to list and forwards to `_list`. |
| `src/apify_client/_resource_clients/schedule_collection.py` | Adds `timeout` parameter to list/create and forwards to `_list`/`_create`. |
| `src/apify_client/_resource_clients/schedule.py` | Adds `timeout` parameter to schedule operations and forwards to `_get`/`_update`/`_delete`. |
| `src/apify_client/_resource_clients/run_collection.py` | Adds `timeout` parameter to run listing and forwards to `_list`. |
| `src/apify_client/_resource_clients/run.py` | Adds request `timeout` widely; renames run `timeout` → `run_timeout` where applicable; defaults waiters to `no_timeout`. |
| `src/apify_client/_resource_clients/request_queue_collection.py` | Adds `timeout` parameter to list/get_or_create and forwards to `_list`/`_get_or_create`. |
| `src/apify_client/_resource_clients/request_queue.py` | Replaces old fixed constants with tiered timeout parameters for queue operations. |
| `src/apify_client/_resource_clients/log.py` | Adds `timeout` parameter to log get/stream methods and forwards to HTTP calls. |
| `src/apify_client/_resource_clients/key_value_store_collection.py` | Adds `timeout` parameter to list/get_or_create and forwards to `_list`/`_get_or_create`. |
| `src/apify_client/_resource_clients/key_value_store.py` | Replaces old fixed constants with tiered timeout parameters across KVS operations. |
| `src/apify_client/_resource_clients/dataset_collection.py` | Adds `timeout` parameter to list/get_or_create and forwards to `_list`/`_get_or_create`. |
| `src/apify_client/_resource_clients/dataset.py` | Replaces old fixed constants with tiered timeout parameters across dataset operations. |
| `src/apify_client/_resource_clients/build_collection.py` | Adds `timeout` parameter to list and forwards to `_list`. |
| `src/apify_client/_resource_clients/build.py` | Adds `timeout` parameter to build operations and forwards to `_get`/`_delete`/`_wait_for_finish`. |
| `src/apify_client/_resource_clients/actor_version_collection.py` | Adds `timeout` parameter to list/create and forwards to `_list`/`_create`. |
| `src/apify_client/_resource_clients/actor_version.py` | Adds `timeout` parameter to get/update/delete and forwards to `_get`/`_update`/`_delete`. |
| `src/apify_client/_resource_clients/actor_env_var_collection.py` | Adds `timeout` parameter to list/create and forwards to `_list`/`_create`. |
| `src/apify_client/_resource_clients/actor_env_var.py` | Adds `timeout` parameter to get/update/delete and forwards to `_get`/`_update`/`_delete`. |
| `src/apify_client/_resource_clients/actor_collection.py` | Adds `timeout` parameter to list/create and forwards to `_list`/`_create`. |
| `src/apify_client/_resource_clients/actor.py` | Adds request `timeout` widely; renames run `timeout` → `run_timeout` for start/call. |
| `src/apify_client/_resource_clients/_resource_client.py` | Makes internal CRUD/list/create helpers require `timeout` and forwards it to HTTP client calls. |
| `src/apify_client/_http_clients/_impit.py` | Updates Impit clients to use tiered timeouts and per-attempt computed timeouts. |
| `src/apify_client/_http_clients/_base.py` | Implements `_compute_timeout()` tier resolution + exponential growth capped by `timeout_max`; updates call signatures to `Timeout`. |
| `src/apify_client/_consts.py` | Replaces single default timeout + old operation constants with tier defaults and `DEFAULT_TIMEOUT_MAX`; lowers `DEFAULT_MAX_RETRIES`. |
| `src/apify_client/_apify_client.py` | Changes client constructors to accept tiered timeouts and passes them into default HTTP clients. |
| `src/apify_client/__init__.py` | Exports `Timeout` from the package top-level. |
| `docs/02_concepts/code/05_retries_sync.py` | Updates docs example to use new retry/timeout configuration arguments. |
| `docs/02_concepts/code/05_retries_async.py` | Updates docs example to use new retry/timeout configuration arguments. |
Comments suppressed due to low confidence (1)

src/apify_client/_types.py:20

  • The JsonSerializable docstring references json.parse, which doesn’t exist in Python’s stdlib (json.loads/json.dumps are the relevant APIs). Consider correcting the reference and removing the anecdotal claim about approval, so the type alias docs stay accurate and focused.

Comment on lines 67 to +71
  token: str | None = None,
- timeout: timedelta = DEFAULT_TIMEOUT,
+ timeout_short: timedelta = DEFAULT_TIMEOUT_SHORT,
+ timeout_medium: timedelta = DEFAULT_TIMEOUT_MEDIUM,
+ timeout_long: timedelta = DEFAULT_TIMEOUT_LONG,
+ timeout_max: timedelta = DEFAULT_TIMEOUT_MAX,

Copilot AI Mar 4, 2026

ImpitHttpClient.__init__ dropped the old timeout= argument in favor of tiered timeouts. Since ImpitHttpClient is part of the public API, this is a breaking change; consider supporting timeout as a deprecated alias (e.g., set timeout_long and/or all tiers) to keep existing user code working until the next major version.

Comment on lines +37 to +39
+ def test_no_timeout_passes_none_to_impit_sync(patch_request: list) -> None:
+     """Test that `'no_timeout'` passes `timeout=None` to impit (uses client-level default)."""
+     client = ImpitHttpClient(timeout_short=timedelta(seconds=10))

Copilot AI Mar 4, 2026

These tests state that 'no_timeout' passes timeout=None to impit “(uses client-level default)”, but the HTTP client docs describe 'no_timeout' as disabling the timeout entirely. Please reword the docstrings to match the intended semantics (disable timeout / pass-through None) to avoid confusion.

  params=request_params,
  json=list(request_batch),
- timeout=STANDARD_OPERATION_TIMEOUT,
+ timeout='medium',

Copilot AI Mar 4, 2026

batch_add_requests(..., timeout=...) exposes a timeout parameter but the worker call is hard-coded to timeout='medium', so the user-provided timeout is ignored. Thread the timeout argument through to _batch_add_requests_worker (or into request_params) and pass it to _http_client.call() to make the API behave as documented.

Suggested change
- timeout='medium',
+ timeout=request_params.get('timeout', 'medium'),
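
One way to thread the argument through to the worker, as the comment asks, sketched with stand-in classes. `RecordingHttpClient`, `QueueClientSketch`, the `batch_size` parameter, and the placeholder URL are all hypothetical; only the method names `batch_add_requests` and `_batch_add_requests_worker` come from the review comment.

```python
from __future__ import annotations

from datetime import timedelta
from typing import Any, Literal, Union

Timeout = Union[timedelta, Literal['no_timeout', 'short', 'medium', 'long']]


class RecordingHttpClient:
    """Hypothetical stand-in HTTP client that records the timeout of each call."""

    def __init__(self) -> None:
        self.seen_timeouts: list[Timeout] = []

    def call(self, *, method: str, url: str, json: Any = None, timeout: Timeout = 'long') -> None:
        self.seen_timeouts.append(timeout)


class QueueClientSketch:
    """Hypothetical stand-in for a request queue client."""

    def __init__(self, http_client: RecordingHttpClient) -> None:
        self._http_client = http_client

    def batch_add_requests(
        self,
        requests: list[dict],
        *,
        batch_size: int = 25,  # assumed batch size, for illustration only
        timeout: Timeout = 'medium',
    ) -> None:
        # Forward the caller's timeout to every batch instead of hard-coding it.
        for start in range(0, len(requests), batch_size):
            batch = requests[start : start + batch_size]
            self._batch_add_requests_worker(batch, timeout=timeout)

    def _batch_add_requests_worker(self, request_batch: list[dict], *, timeout: Timeout) -> None:
        self._http_client.call(
            method='POST',
            url='https://example.invalid/requests/batch',  # placeholder URL
            json=list(request_batch),
            timeout=timeout,
        )
```

Passing the value as an explicit keyword keeps it out of the query parameters, and the `'medium'` default preserves the current behavior for callers who do not set it.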

Comment on lines 91 to 99
  def __init__(
      self,
      *,
      token: str | None = None,
-     timeout: timedelta = DEFAULT_TIMEOUT,
+     timeout_short: timedelta = DEFAULT_TIMEOUT_SHORT,
+     timeout_medium: timedelta = DEFAULT_TIMEOUT_MEDIUM,
+     timeout_long: timedelta = DEFAULT_TIMEOUT_LONG,
+     timeout_max: timedelta = DEFAULT_TIMEOUT_MAX,
      max_retries: int = DEFAULT_MAX_RETRIES,

Copilot AI Mar 4, 2026

HttpClientBase.__init__ removed the previously supported timeout parameter. Since HttpClient/HttpClientAsync are public ABCs intended for subclassing, this is a breaking change for custom clients that call super().__init__(timeout=...). Consider keeping timeout as a deprecated keyword-only alias (e.g., mapping it to timeout_long and/or setting all tiers) to preserve backward compatibility until the next major release.

Comment on lines 119 to +123
  min_delay_between_retries: timedelta = DEFAULT_MIN_DELAY_BETWEEN_RETRIES,
- timeout: timedelta = DEFAULT_TIMEOUT,
+ timeout_short: timedelta = DEFAULT_TIMEOUT_SHORT,
+ timeout_medium: timedelta = DEFAULT_TIMEOUT_MEDIUM,
+ timeout_long: timedelta = DEFAULT_TIMEOUT_LONG,
+ timeout_max: timedelta = DEFAULT_TIMEOUT_MAX,

Copilot AI Mar 4, 2026

ApifyClient.__init__ no longer accepts the previous timeout parameter, which is a breaking change for existing users on a 2.x release line. Consider re-introducing timeout as a deprecated keyword-only argument (e.g., mapping to timeout_long / setting all tiers) and emitting a DeprecationWarning to allow a gradual migration.
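
A minimal sketch of the deprecated-alias approach suggested here, assuming the old `timeout` maps only to `timeout_long`. `ClientSketch` is a hypothetical stand-in, not the real `ApifyClient`, and the default values are the ones listed in the PR summary.

```python
from __future__ import annotations

import warnings
from datetime import timedelta

# Tier defaults taken from the PR summary.
DEFAULT_TIMEOUT_SHORT = timedelta(seconds=5)
DEFAULT_TIMEOUT_MEDIUM = timedelta(seconds=30)
DEFAULT_TIMEOUT_LONG = timedelta(seconds=360)
DEFAULT_TIMEOUT_MAX = timedelta(seconds=360)


class ClientSketch:
    """Hypothetical stand-in for a client constructor with a deprecated alias."""

    def __init__(
        self,
        *,
        timeout: timedelta | None = None,  # deprecated alias (assumed design)
        timeout_short: timedelta = DEFAULT_TIMEOUT_SHORT,
        timeout_medium: timedelta = DEFAULT_TIMEOUT_MEDIUM,
        timeout_long: timedelta = DEFAULT_TIMEOUT_LONG,
        timeout_max: timedelta = DEFAULT_TIMEOUT_MAX,
    ) -> None:
        if timeout is not None:
            warnings.warn(
                '`timeout` is deprecated; pass `timeout_long` instead.',
                DeprecationWarning,
                stacklevel=2,  # point the warning at the caller's line
            )
            # Assumption: the legacy value replaces only the long tier.
            timeout_long = timeout
        self.timeout_short = timeout_short
        self.timeout_medium = timeout_medium
        self.timeout_long = timeout_long
        self.timeout_max = timeout_max
```

Whether the legacy value should also set the other tiers or raise `timeout_max` is left open here, as it is in the comment.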

Comment on lines 472 to +476
  min_delay_between_retries: timedelta = DEFAULT_MIN_DELAY_BETWEEN_RETRIES,
- timeout: timedelta = DEFAULT_TIMEOUT,
+ timeout_short: timedelta = DEFAULT_TIMEOUT_SHORT,
+ timeout_medium: timedelta = DEFAULT_TIMEOUT_MEDIUM,
+ timeout_long: timedelta = DEFAULT_TIMEOUT_LONG,
+ timeout_max: timedelta = DEFAULT_TIMEOUT_MAX,

Copilot AI Mar 4, 2026

ApifyClientAsync.__init__ no longer accepts the previous timeout parameter, which is a breaking change for existing users on a 2.x release line. Consider re-introducing timeout as a deprecated keyword-only argument (e.g., mapping to timeout_long / setting all tiers) and emitting a DeprecationWarning to allow a gradual migration.

Contributor

@Pijukatel Pijukatel left a comment

I think it is fine to expose the timeouts at the API client method level as well, but are there any use cases for it?

Users can already change the timeout behavior to some extent by supplying their own HTTP client, for example a client that inherits from ImpitHttpClient and only modifies the timeout-related behavior.
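
For illustration, a sketch of the subclassing route mentioned above. `HttpClientSketch` is a hypothetical stand-in base class rather than the real `ImpitHttpClient`; the `_compute_timeout(timeout, attempt)` hook name follows the diff in this PR, but its exact contract and the 1-based `attempt` are assumed.

```python
from __future__ import annotations

from datetime import timedelta


class HttpClientSketch:
    """Hypothetical base client: doubles the timeout per attempt, capped at timeout_max."""

    def __init__(self, *, timeout_max: timedelta = timedelta(seconds=360)) -> None:
        self.timeout_max = timeout_max

    def _compute_timeout(self, timeout: timedelta, attempt: int) -> timedelta:
        # attempt is assumed to be 1-based
        return min(timeout * 2 ** (attempt - 1), self.timeout_max)


class FixedTimeoutClient(HttpClientSketch):
    """Custom client that disables the growth: every attempt gets the same timeout."""

    def _compute_timeout(self, timeout: timedelta, attempt: int) -> timedelta:
        return min(timeout, self.timeout_max)
```

An override like this changes the behavior for every request made through the client, whereas a per-method `timeout` argument changes it for a single call.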

  headers=headers,
  content=content,
- timeout=self._calculate_timeout(attempt, timeout),
+ timeout=self._compute_timeout(timeout, attempt),
Contributor

Out of curiosity: what does impit do when timeout is None? Does it fall back to some impit default, or is there no timeout at all?

  restart_on_error: bool | None = None,
  memory_mbytes: int | None = None,
- timeout: timedelta | None = None,
+ run_timeout: timedelta | None = None,
Contributor

I agree with this, but does it make this change breaking or not?

@Pijukatel Pijukatel requested a review from janbuchar March 4, 2026 13:39
@Pijukatel
Contributor

@janbuchar could you please take a look? A refactoring like this will probably make its way to the JS version as well, so it would be best to comment on any changes now.

3 participants